Abstract: In the algebraic view, the solution to a network 
coding problem is seen as a variety specified by a system of 
polynomial equations typically derived by using edge-to-edge 
gains as variables. The output from each sink is equated to its 
demand to obtain polynomial equations. In this work, we propose 
a method to derive the polynomial equations using source-to- 
sink path gains as the variables. In the path gain formulation, 
we show that linear and quadratic equations suffice; therefore, 
network coding becomes equivalent to a system of polynomial 
equations of maximum degree 2. We present algorithms for 
generating the equations in the path gains and for converting 
path gain solutions to edge-to-edge gain solutions. Because of the 
low degree, simplification is readily possible for the system of 
equations obtained using path gains. Using small-sized network 
coding problems, we show that the path gain approach results 
in simpler equations and determines solvability of the problem 
in certain cases. On a larger network (with 87 nodes and 161 
edges), we show how the path gain approach continues to provide 
deterministic solutions to some network coding problems. 

Index Terms — Algebraic network coding, Network coding, 
Scalar linear network coding. 



I. Introduction 

THE idea of network coding over error-free networks,
pioneered in [1], has been a subject of active current
research. The general idea of linear network coding, where
intermediate nodes linearly combine incoming packets, was
explored in [2]. A simple and effective algebraic formulation
of the general network coding problem was introduced in
[3]. This established a direct connection between a network
information flow problem and an algebraic variety over the
closure of a finite field.

Using the formulations of [2], [3], the multicast network
coding problem, where one source transmits at the same rate
to a set of sinks, has been characterized almost completely.
A linear network code exists for the multicast case in a large
enough finite field and can be found in polynomial time [4].
The insufficiency of linear coding in the non-multicast case
has been demonstrated in [5]. Recent work in [6] and [7]
has shown the restrictions imposed on the field characteristic
for the scalar linear solvability of a general network coding
problem. See [6] for more non-multicast examples.

In the algebraic view, the network code is seen as a variety 
specified by a system of polynomial equations in multiple 
variables taking values from a finite field [3], [6]. To derive the
equations corresponding to a given network coding problem,
edge-to-edge gains are assigned as variables. For every node, 
the flow on outgoing edges is written down in terms of the 
flows on the incoming edges using the edge-to-edge gains. 
The flow propagates in this manner from the sources to the 
sinks. The output from the sink is equated to its demand, and 
polynomial equations in the edge-to-edge gains are obtained. 

In this work, we propose a method to derive the equations 
using path gains as the variables. The gain on every source 
to sink path becomes a variable in the proposed formulation. 
In the method of [3], the path gain would be a product of 
several edge-to-edge gain variables. The advantage of the path 
gain formulation is that the final equations are only linear and 
quadratic, as shown in the remainder of this article. Because 
of the low degree and the inherent nature of the scalar linear 
network coding problem, simplification is readily possible for 
the system of equations. We provide an algorithm to compute 
the equations in the path gain formulation, and demonstrate 
the efficacy of the path gain approach by illustrative examples. 
Starting with the butterfly network and other interesting small- 
sized network coding problems, we show that the path gain 
approach provides results on solvability of the problem. On a 
larger network (with 87 nodes and 161 edges), we show how 
the path gain approach continues to provide solutions to some 
network coding problems. 

The path gain formulation is equivalent to the edge-to- 
edge gain formulation and can be derived from it. Therefore, 
the work presented in this article is a method to simplify 
the equations generated by the edge-to-edge gain variable 
assignment. While the number of variables in the edge-to- 
edge formulation is of the order of the number of edges, the 
number of monomial terms in these variables is exponential in 
the number of edges. Hence, the polynomial system is of size 
that can be exponential in the size of the network. Assigning 
variable names to the paths (which can be exponential in the 
size of the network in number) does not necessarily make the 
path gain formulation more complex than the edge-to-edge 
gain formulation as far as solving the equations is concerned. 
However, in an actual implementation, the edge-to-edge gains 
are to be used. To complete the path gain formulation, we 
provide an algorithm to compute the edge-to-edge gains from 
the path gains. 

Though there are several other standard methods to simplify
systems of polynomial equations (such as Gröbner basis
methods), many problems in the area of solving systems of
polynomial equations (and in network coding with multiple
sources and sinks) are either NP-hard or undecidable. In this
light, the path gain formulation appears to be simpler than the
edge-to-edge formulation in the sense that simplifications and
solutions are easier in several examples (both small and large).

Several methods and techniques to study the network coding
problem have been introduced by many researchers in this
area, which has seen intense recent research activity. Following
the information-theoretic methods in [1], more information-
theoretic methods were used for characterizing network coding
for multimessage unicasts in [8]. The algebraic formulation
in [3] provided an elegant and powerful method to study
network coding. Random network coding [9], which is a
popular choice in practical implementations, was introduced
and studied using algebraic tools. The linear programming
formulation has seen applications in wireless network coding
[10] and optimizing network coding with a cost criterion [11].
The combinatorial approach, proposed and developed in [12]
and [13], has provided methods for studying the field sizes in
network coding problems.

In the context of the prior work cited above, the path 
gain formulation for algebraic network coding presents the 
equivalence between network coding and a maximum-degree-2 
system of polynomial equations for the first time. The equiva- 
lence is achieved without introducing any new monomial terms 
that are not present in the original system. The equations 
obtained from the path gain formulation are amenable to 
considerable simplification in several cases of interest. Hence, 
the path gain method can provide deterministic solutions to 
several linear network coding problems. The method can, 
in some cases, provide results on solvability. The primary 
utility of the method is likely to be in larger examples. As
an illustration, for the network (in Fig. 5) with 87 nodes and
161 edges, we present results of solutions to certain network
coding problems with multiple sources and sinks in Section IV.

The rest of this article is organized as follows. We start
with a notational description of the network coding problem
in Section II, which also introduces the edge-to-edge gain
algebraic formulation of [3]. The path gain formulation is
presented in Section III, where we provide a graph
transformation algorithm that is used to represent and compute the
equations in a transformed graph. At the end of the section, we
show how the equations derived from path gain variables are
amenable to easy simplifications. In Section IV, we illustrate
the advantages of the path gain formulation using various
example networks drawn from the literature. We also provide
results for a large Internet Service Provider (ISP) network. In
Section V, we give an algorithm (that uses the transformed
graph) to derive the edge-to-edge gains from the path gains.
Finally, we provide concluding remarks in Section VI.



II. The Network Coding Problem 

The communication network is modeled as a directed, 
acyclic multigraph, G = (V, E), where the node set V 
represents the terminals and switches in the network and the 
edge set E represents the communication links. It is assumed 
that all communication links are error-free and have unit 
capacity. 



A. Notation 

For a given edge e = (u,v), we denote: 

u = tail(e) 
v = head(e) 

For each node $v \in V$, we define

$$I(v) = \{e \in E : \mathrm{head}(e) = v\},$$
$$O(v) = \{e \in E : \mathrm{tail}(e) = v\}.$$

Let us further assume the following without loss of gener- 
ality: 

1) A node $v$ is a source node iff $|I(v)| = 0$, and all source
nodes produce exactly one unit of data per unit time.

2) A node $v$ is a sink node iff $|O(v)| = 0$, and all sink
nodes demand exactly one unit of data per unit time.

In cases where a node v produces (demands) more than 
one data symbol, we can add virtual source (sink) nodes that 
produce (demand) exactly one data symbol, have exactly one 
output (input) link connecting them to v and no input (output) 
links. 

Then, the sets of source and sink nodes are defined as follows:

$$S = \{v \in V : |I(v)| = 0\} = \{s_1, s_2, \ldots, s_{|S|}\},$$
$$T = \{v \in V : |O(v)| = 0\} = \{t_1, t_2, \ldots, t_{|T|}\}.$$

Let the sink $t_j$ demand the $s(j)$-th source. For every source $s$, a
virtual incoming edge $e(s)$ is added for notational convenience
(as in the edge-to-edge gain formulation [3]).
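
The notation above can be made concrete with a short sketch. The edge list is an assumption: it is the modified butterfly network as inferred from the source-sink paths listed in Section IV-A, and the function names are illustrative only.

```python
from collections import defaultdict

# Assumed edge list (tail, head) of the modified butterfly network,
# inferred from the source-sink paths listed in Section IV-A.
EDGES = [(1, 3), (2, 3), (3, 4), (1, 5), (4, 5), (4, 6),
         (2, 6), (5, 7), (5, 8), (6, 9), (6, 10)]

def incoming_outgoing(edges):
    """Return I(v) and O(v) as dicts mapping a node to a list of edge indices."""
    I, O = defaultdict(list), defaultdict(list)
    for idx, (tail, head) in enumerate(edges):
        O[tail].append(idx)   # the edge leaves its tail
        I[head].append(idx)   # the edge enters its head
    return I, O

def sources_and_sinks(edges):
    """S = nodes with |I(v)| = 0, T = nodes with |O(v)| = 0."""
    I, O = incoming_outgoing(edges)
    nodes = {v for e in edges for v in e}
    S = sorted(v for v in nodes if len(I[v]) == 0)
    T = sorted(v for v in nodes if len(O[v]) == 0)
    return S, T

S, T = sources_and_sinks(EDGES)
```

For this edge list, nodes 1 and 2 come out as sources and nodes 7, 8, 9 and 10 as sinks, in line with the butterfly example used later.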

Let us now assume that we use a finite alphabet $H$. For
each edge $e$, an edge function is then defined as a mapping
$f_e : H^l \to H$, where $l = 1$ if $\mathrm{tail}(e) \in S$ and $l = |I(\mathrm{tail}(e))|$
otherwise. For a sink $t$, a virtual outgoing edge $e(t)$ is added
to denote the output. The edge function on this virtual edge,
which is a mapping denoted $f_t : H^{|I(t)|} \to H$, is called the
output function of the sink.

Definition 1: The collection of all the edge functions in a 
given network is defined as a network code. If all the edge 
functions are linear maps with respect to a field alphabet H, 
then the code is a scalar linear code. 

Let the data symbol generated at the $i$-th source node, $s_i \in S$,
be denoted by $X_i$. The data symbol demanded by the $j$-th
sink node, $t_j \in T$, is $X_{s(j)}$. The sources and sinks implicitly
define a set of connection requirements for the given network
$G$. The connection requirement is met at a sink $t_j$ if the output
of the function $f_{t_j}$ equals $X_{s(j)}$ for all inputs.

Given a network G, the set of source nodes S and the set of 
sink nodes T, the network coding problem is to determine all 
the edge functions such that all the connection requirements 
are satisfied. If such a set of edge functions exists, then 
the network coding problem is solvable. If a set of linear 
edge functions (with respect to a finite field H) exists that 
satisfies all the connection requirements, then the network 
coding problem is scalar-linearly solvable. 

In a scalar linear network coded flow (over a field $H$), the
edge function of an edge $e$ can be written as $\sum_{i=1}^{|S|} a_i X_i$, where
$a_i \in H$. We refer to $\sum_{i=1}^{|S|} a_i X_i$ as either the edge function of
$e$ or the symbol flowing through $e$ and denote it as a vector
$f_e = [a_1\ a_2\ \cdots\ a_{|S|}]$. Similarly, the output function at the
sink has a vector notation.

B. Koetter-Medard formulation and edge-to-edge gains 

Solving the scalar linear network coding problem was
formulated as a problem of solving a system of polynomial
equations by Koetter and Medard in [3]. The idea is to
construct the linear edge function $f_e$ for an edge $e = (u, v)$
recursively as follows:

$$f_e = \sum_{e' \in I(u)} \alpha_{e',e} f_{e'} \tag{1}$$

where $\alpha_{e',e}$ is an edge-to-edge gain. To start the recursion,
the edges out of a source node $s_i$ are assigned the unit
coding vector with a 1 in the $i$-th position. The edge functions
for the remaining edges, found using (1), become vectors of
polynomials in the edge-to-edge gains $\alpha_{e',e}$. Finally, at a sink
$t_j$, the output edge function is equated to the unit vector with
a 1 in the $s(j)$-th position. These equations form a polynomial
system in the edge-to-edge gains $\alpha_{e',e}$ for $e' \in I(\mathrm{tail}(e))$.
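
The recursion in (1) can be sketched numerically. This is a minimal illustration, not the construction of [3] itself: it assumes a prime field GF(p), evaluates the coding vectors for given numeric gains instead of keeping them as polynomials, and uses a hypothetical three-edge network; all names are illustrative.

```python
def edge_functions(edges, gains, source_edges, p):
    """Coding vector of every edge over GF(p), following recursion (1).

    edges        : list of (tail, head), listed so that every edge appears
                   after all edges entering its tail (a topological order)
    gains        : dict {(incoming edge index, outgoing edge index): gain}
    source_edges : dict {edge index: source index} for edges leaving a source
    """
    n_src = len(set(source_edges.values()))
    f = {}
    for i, (tail, _) in enumerate(edges):
        if i in source_edges:
            # unit coding vector with a 1 in the position of the source
            f[i] = [int(k == source_edges[i]) for k in range(n_src)]
            continue
        vec = [0] * n_src
        for j, (_, head_j) in enumerate(edges):
            if head_j == tail:                 # edge j belongs to I(tail(e))
                g = gains.get((j, i), 0)       # missing gains are taken as 0
                vec = [(x + g * y) % p for x, y in zip(vec, f[j])]
        f[i] = vec
    return f

# Hypothetical example: two sources (nodes 0 and 1) feed node 2, which feeds node 3.
EXAMPLE_EDGES = [(0, 2), (1, 2), (2, 3)]
f = edge_functions(EXAMPLE_EDGES, {(0, 2): 3, (1, 2): 4}, {0: 0, 1: 1}, p=5)
```

Here the symbol on the last edge is the combination of the two source symbols weighted by the two gains, which is the single-node case of (1).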

The Koetter-Medard algebraic formulation is illustrated for
the case of the modified butterfly network shown in Fig. 1 with
two sources and four sinks. Note that the network in Fig. 1 is
identical to the classic butterfly network under our definition of
sources and sinks. The edge functions under the assignment
of edge-to-edge gains (as in [3]) are shown in Fig. 1. The
formulation described in [3] gives the following 8 equations
in 10 variables:

$$\begin{aligned}
\alpha_3 + \alpha_4\alpha_1 &= 1, & \alpha_4\alpha_2 &= 0,\\
\alpha_5 + \alpha_6\alpha_1 &= 0, & \alpha_6\alpha_2 &= 1,\\
\alpha_7\alpha_2 + \alpha_8 &= 0, & \alpha_7\alpha_1 &= 1,\\
\alpha_9\alpha_2 + \alpha_{10} &= 1, & \alpha_9\alpha_1 &= 0.
\end{aligned}$$

Fig. 1. Flow in the butterfly network.

In this work, we propose methods to simplify the algebraic
formulation for the general scalar-linear network coding
problem through the use of path gains as opposed to the edge-to-edge
gains used in [3]. As shown in the remainder of the paper, the
path gain approach results in considerable simplifications in
several cases.

III. Algebraic Formulation Using Path Gains

The main idea in the proposed formulation is to use path
gains instead of edge-to-edge gains as variables and obtain
a system of polynomial equations. We begin by showing a
derivation of the path gain formulation from the edge-to-edge
gain formulation.

A. Derivation from Koetter-Medard formulation

Let the output edge function at the sink $t_j$ demanding source
$s(j)$ be $f_{t_j} = [g(1)\ g(2)\ \cdots\ g(|S|)]$. If $P = (e_1 e_2 \cdots e_l)$ is a
path from the source virtual incoming edge $e(s_i) = e_1$ to
the sink virtual outgoing edge $e(t_j) = e_l$, the polynomial
$g(i)$ contains a path gain term $a(P) = \prod_{m=2}^{l} \alpha_{e_{m-1},e_m}$.
Conversely, each term in the polynomial $g(i)$ is the gain along
a path from the source edge $e(s_i)$ to the sink edge $e(t_j)$.

In the proposed formulation, the path gain of a path $P$
from a source input virtual edge to a sink output virtual
edge is assigned as a variable denoted $a(P)$. Suppose there
are $N_{ij}$ paths, denoted $P_{ijk}$ ($1 \le k \le N_{ij}$), from $e(s_i)$
to $e(t_j)$. We see that the polynomial $g(i)$ can be written as

$$g(i) = \sum_{k=1}^{N_{ij}} a(P_{ijk}).$$

The proposed approach can be summarized as follows.
Equating the output edge function at the sinks to unit vectors,
the equations in the Koetter-Medard formulation become linear
in the new path gain variables. We call these the
no-interference conditions. However, if two paths overlap in
one or more edges, there are inter-relationships between the
path gain variables. These inter-relationships are called edge
compatibility conditions, and they turn out to be quadratic in
the path gain variables.

A simple description of the edge compatibility conditions
is as follows. If two source-sink paths $P = P_1eP_2$ and
$Q = Q_1eQ_2$ overlap in an edge $e$, we see that the relationship
$a(P)a(Q) = a(P_1eQ_2)a(Q_1eP_2)$ needs to be satisfied, since
both sides are equal to $a(P_1e)a(eP_2)a(Q_1e)a(eQ_2)$. Note that
$P_1eQ_2$ and $Q_1eP_2$ are source-sink paths as well. However,
several of these equations can be combined to produce the
necessary set of edge compatibility conditions. This is described
in more detail in Section III-E.
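
The identity behind the edge compatibility conditions, a(P)a(Q) = a(P1 e Q2)a(Q1 e P2) for two paths sharing an edge e, can be checked numerically. The sketch below is only an illustration under stated assumptions: the edge labels are hypothetical, each sub-path is a single edge, and arithmetic is over a prime field GF(p).

```python
import random

def path_gain(path, gains, p):
    """a(P): product of the edge-to-edge gains along consecutive edges, mod p."""
    prod = 1
    for e_prev, e_next in zip(path, path[1:]):
        prod = (prod * gains[(e_prev, e_next)]) % p
    return prod

p = 7
# Hypothetical labels: P = P1 e P2 and Q = Q1 e Q2 share the edge "e".
P1, P2, Q1, Q2 = ["p1"], ["p2"], ["q1"], ["q2"]
random.seed(0)
gains = {pair: random.randrange(p)
         for pair in [("p1", "e"), ("q1", "e"), ("e", "p2"), ("e", "q2")]}

P = P1 + ["e"] + P2
Q = Q1 + ["e"] + Q2
swapped_1 = P1 + ["e"] + Q2      # the path P1 e Q2
swapped_2 = Q1 + ["e"] + P2      # the path Q1 e P2

lhs = path_gain(P, gains, p) * path_gain(Q, gains, p) % p
rhs = path_gain(swapped_1, gains, p) * path_gain(swapped_2, gains, p) % p
```

Both products contain the same four edge-to-edge gains, so `lhs` and `rhs` agree for any choice of gains; the seed only fixes one sample.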



B. Constructing source-sink paths as trees 

To work with the path gain formulation for a given network 
and connection requirements, we need to determine source- 
sink paths and assign path gain variables. Then, the no 
interference conditions and the non-trivial edge compatibility 
conditions have to be determined. In the remainder of this 
section, we provide algorithms for performing these tasks. In 
these algorithms, we employ a graph transformation that is 
very useful in both visualizing the path gain approach and 
solving for the edge-to-edge gains from the path gains. 

An important ingredient in the algorithms is an ordering
of the nodes. Every directed acyclic network determines a
topological order or sequencing of nodes from sources to
sinks or vice versa. A standard algorithm for finding such
a topological ordering of the nodes [14] is given below for
completeness.

Algorithm 1: Topological Sorting
Input: A directed acyclic graph, $G = (V, E)$.

1) Associate with each node $v$ a value $N(v)$ that is
initialized to $|O(v)|$.

2) Pick a node $v$ such that $N(v) = 0$, do

• For each edge $e \in I(v)$,
$N(\mathrm{tail}(e)) \leftarrow N(\mathrm{tail}(e)) - 1$.

• $N(v) \leftarrow -1$.

• Append $v$ to the ordering, $P$.

3) If any node has not been added to the ordering yet, go
to Step 2. Else terminate.

Output: $P$, a topological ordering of nodes.
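
Algorithm 1 can be sketched in a few lines. The version below is an illustration rather than the authors' code: it uses a queue of ready nodes in place of the N(v) ← −1 marking, and the butterfly edge list is an assumption inferred from the paths listed in Section IV-A.

```python
def sink_first_order(nodes, edges):
    """Order the nodes so that every node appears after all of its successors.

    nodes : iterable of node labels
    edges : list of (tail, head) pairs of a directed acyclic multigraph
    """
    nodes = list(nodes)
    remaining_out = {v: 0 for v in nodes}          # N(v), initialized to |O(v)|
    for tail, _ in edges:
        remaining_out[tail] += 1
    ready = [v for v in nodes if remaining_out[v] == 0]   # the sinks
    order = []
    while ready:
        v = ready.pop(0)
        order.append(v)
        for tail, head in edges:
            if head == v:                          # e in I(v)
                remaining_out[tail] -= 1
                if remaining_out[tail] == 0:
                    ready.append(tail)
    if len(order) != len(nodes):
        raise ValueError("the graph contains a cycle")
    return order

# Assumed edge list of the modified butterfly network (see Section IV-A).
EDGES = [(1, 3), (2, 3), (3, 4), (1, 5), (4, 5), (4, 6),
         (2, 6), (5, 7), (5, 8), (6, 9), (6, 10)]
ORDER = sink_first_order(range(1, 11), EDGES)
```

The ordering is not unique in general; any order in which each node follows all nodes it feeds is acceptable for the graph transformation.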
Notice that the sinks occur first in the topological ordering. 
Loosely, the ordering traverses the nodes from sinks to sources. 
The final algorithm that takes a network coding problem as 
input and outputs a set of trees that collect together all source- 
sink paths is given below: 

Algorithm 2: Graph Transformation
Input: A directed acyclic graph $G = (V, E)$, set of sources $S$,
set of sinks $T$, connection requirements $C$.

1) Obtain a topological ordering $P$ for the graph
$G = (V, E)$ using Algorithm 1.

2) Let $G'(V', E') = G(V, E)$.

3) Loop through the nodes $v \in V$ in the order defined by
$P$, do

• If $|O(v)| > 1$,

- for each edge $e \in O(v)$, add a new node $v'$ to
$V'$ with one output link connecting it to $\mathrm{head}(e)$
and one input link $e'$ for each $e'' \in I(v)$ such
that $\mathrm{tail}(e') = \mathrm{tail}(e'')$.

- Delete the old node $v$ in $V'$.

Output: $G' = (V', E')$, a transformed network.
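
A minimal sketch of Algorithm 2 follows. The data layout is an assumption made for illustration: a node of the transformed graph is a pair (original label, copy index), edges are plain (tail, head) pairs, and the butterfly edge list is inferred from the paths listed in Section IV-A.

```python
def transform(edges, order):
    """Replicate every node that has more than one output link.

    edges : list of (tail, head) pairs of the original DAG
    order : sink-to-source topological ordering of the original nodes
    Returns the edge list of G'; each node is (original label, copy index).
    """
    cur = [((u, 0), (v, 0)) for u, v in edges]
    for v in order:
        node = (v, 0)
        out_links = [e for e in cur if e[0] == node]
        if len(out_links) <= 1:
            continue
        in_links = [e for e in cur if e[1] == node]
        # delete the old node together with all of its links
        cur = [e for e in cur if node not in e]
        for c, (_, head) in enumerate(out_links, start=1):
            copy = (v, c)
            cur.append((copy, head))                          # single output link
            cur.extend((tail, copy) for tail, _ in in_links)  # one input per e'' in I(v)
    return cur

EDGES = [(1, 3), (2, 3), (3, 4), (1, 5), (4, 5), (4, 6),
         (2, 6), (5, 7), (5, 8), (6, 9), (6, 10)]
ORDER = [7, 8, 9, 10, 5, 6, 4, 3, 1, 2]
G2 = transform(EDGES, ORDER)
copies_of_source_1 = {tail for tail, _ in G2 if tail[0] == 1}
```

For this input every node of the result has at most one output link, and each source ends up with six copies, one per source-sink path that starts at it.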

Theorem 2: The final transformed network is a set of $|T|$
directed trees $\{T_1, T_2, \ldots, T_{|T|}\}$ such that sink $t_j$ is the root of
the $j$-th tree. All leaf nodes in the trees are copies of one of the
source nodes. There is a one-to-one correspondence between
the paths from leaf nodes, which are copies of the source $s_i$,
to the root in $T_j$ and the paths from $s_i$ to $t_j$ in the original
network.

Proof: Each node in the transformed network will have 
exactly one output link and the acyclic property of the graph is 
maintained by the transformation. The underlying undirected 
graph is a set of disjoint trees, because any cycle in it must 
imply that either the cycle is also present in the directed graph 
or that one of the nodes in the directed graph has more than 
one output link. Hence, the equivalent network is made up of 
a set of directed trees. 

The transformation maintains one output link for each node
in the original graph that has $|O(v)| > 1$. So, the only nodes
that will have $|O(v)| = 0$, and hence be the roots of these
trees, are the sink nodes (which had $|O(v)| = 0$ to start with).
Hence, each sink would be the root of a directed tree in which
all edges are directed towards this root.

Also, the number of input links of a copied node in the
transformed graph is equal to the number of input links
possessed by the original node. So, the only nodes that will
have $|I(v)| = 0$, and hence be leaf nodes in these trees, are
copies of the source nodes (which had $|I(v)| = 0$ to start
with).

Finally, since all nodes are visited in the topological order 
from the sinks to the sources, all paths from the sinks to the 
sources will be part of the final network. This results in the 
one-to-one correspondence in the paths. ■ 

An example of this transformation applied to the butterfly
network (Fig. 2(a)) can be seen in Fig. 2(b). To apply the graph
transformation, the topological ordering of the nodes is chosen
to be 7-8-9-10-5-6-4-3-1-2. Nodes 7, 8, 9 and 10
are sink nodes, and occur first in the ordering. Nodes 5 and 6
will be replicated 2 times, since they both have 2 output links.
This will result in the replication of the edges $e_4$, $e_5$, $e_6$ and
$e_7$. Node 4 will now have 4 output links and will have to be
replicated as many times along with edge $e_3$. Similarly, Node
3 will also be replicated 4 times along with edges $e_1$ and $e_2$.
Finally, the source nodes 1 and 2 will be replicated 6 times
each since they both now have 6 output links.

C. Path gain variables and edge functions 

Since there is a one-to-one correspondence between the 
leaf source nodes in the transformed network and the source- 
sink paths in the original network, path gain variables are 
assigned at the leaf source nodes. The assignment is illustrated
in Fig. 2(b) for the butterfly network. Source nodes 1 and 2
are assigned the variable names $a$ and $b$, respectively. The
subscripts are chosen tree by tree in the transformed network.
In the tree with root as Node 7, the two copies of source node
1 are assigned variables $a_1$ and $a_2$, while the single copy of
source node 2 is assigned the variable $b_1$. In the tree with root
node 8, the variables are $a_3$, $a_4$ for the two copies of Node
1, and $b_2$ for the single copy of Node 2. We continue in this
manner to name the scaling variables at the source leaf nodes
of the other two trees to get variables $a_1, a_2, \ldots, a_6$ and
$b_1, b_2, \ldots, b_6$.

Once path gain variables are assigned (from some field) at 
the leaf nodes, all edge functions are computed in the trans- 
formed network assuming that intermediate nodes perform 
addition only. The output function at the root (sink) is the sum 
of all incoming edge functions. For instance, in the tree with
root as Node 7 in Fig. 2(b), the edge functions are as follows:
for $e_1$, $a_2X_1$; for $e_2$, $b_1X_2$; for $e_3$ and $e_5$, $a_2X_1 + b_1X_2$; for
$e_4$, $a_1X_1$; for $e_8$, the edge function is $(a_1 + a_2)X_1 + b_1X_2$.
The output function at sink node 7 is $(a_1 + a_2)X_1 + b_1X_2$.
Similarly, the edge functions can be computed for the other
trees. Note that the intermediate nodes perform addition as the
entire path gain has been assigned as a variable at the leaf.

D. No Interference conditions 

Because of the equivalence between paths from sources to
a sink $t_j$ in the original network and leaf nodes in the tree $T_j$,
we see that the output function calculated in the transformed
network is identical to the output function in the Koetter-
Medard formulation as given in Section III-A. Therefore, the
no interference conditions are obtained by equating the output
function of the root in the transformed network to its demand.

In Fig. 2(b), the output functions at the root nodes 7, 8, 9 and
10 are $(a_1 + a_2)X_1 + b_1X_2$, $(a_3 + a_4)X_1 + b_2X_2$,
$a_5X_1 + (b_3 + b_4)X_2$ and $a_6X_1 + (b_5 + b_6)X_2$, respectively. For the
symbol at Node 7 to be equal to the required $X_1$, we have
$a_1 + a_2 = 1$ and $b_1 = 0$. Other equations are derived similarly.
Hence, in the butterfly network of Fig. 2, we get the following
linear equations:

$$\begin{aligned}
a_1 + a_2 &= 1, & b_1 &= 0,\\
a_3 + a_4 &= 0, & b_2 &= 1,\\
a_5 &= 1, & b_3 + b_4 &= 0,\\
a_6 &= 0, & b_5 + b_6 &= 1.
\end{aligned} \tag{2}$$



For completeness, we state the general form of the no-
interference conditions below. In general, each path gain
variable in the transformed network is associated with exactly
one source symbol and one sink (or tree). Let us denote the
source-sink path gains by $a_{ijk}$, where $i \in \{1, \ldots, |S|\}$ denotes
the source, $j \in \{1, \ldots, |T|\}$ denotes the sink (or the tree),
and $k \in \{1, \ldots, N_{ij}\}$ is an index among all copies of the source
node $s_i$ in the tree $T_j$ rooted at $t_j$. Then, the general form of
the "No Interference" conditions can be written as follows:

$$\sum_{k=1}^{N_{ij}} a_{ijk} = \begin{cases} 1 & \text{if } s(j) = i,\\ 0 & \text{otherwise.} \end{cases} \tag{3}$$
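
The no-interference conditions can be generated by counting source-sink paths. The sketch below is illustrative only: it enumerates paths directly in the original graph instead of building the trees, and the butterfly edge list and the demand map are assumptions inferred from the example (sinks 7 and 9 demand source 1, sinks 8 and 10 demand source 2).

```python
def source_sink_paths(edges, source, sink):
    """All directed paths from source to sink, each as a list of nodes."""
    succ = {}
    for tail, head in edges:
        succ.setdefault(tail, []).append(head)
    paths, stack = [], [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == sink:
            paths.append(path)
            continue
        for nxt in succ.get(path[-1], []):
            stack.append(path + [nxt])
    return paths

def no_interference(edges, sources, demands):
    """Return {(i, j): (N_ij, rhs)}: the N_ij path gains a_ijk must sum to rhs,
    where rhs is 1 if sink j demands source i and 0 otherwise."""
    conditions = {}
    for j, wanted in demands.items():
        for i in sources:
            n_ij = len(source_sink_paths(edges, i, j))
            if n_ij:
                conditions[(i, j)] = (n_ij, int(i == wanted))
    return conditions

EDGES = [(1, 3), (2, 3), (3, 4), (1, 5), (4, 5), (4, 6),
         (2, 6), (5, 7), (5, 8), (6, 9), (6, 10)]
DEMANDS = {7: 1, 8: 2, 9: 1, 10: 2}      # assumed sink -> demanded source
CONDITIONS = no_interference(EDGES, [1, 2], DEMANDS)
```

Each entry corresponds to one linear equation; for instance the entry for source 1 and sink 7 says that two path gains must sum to 1.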



E. Edge Compatibility conditions 

As explained before, the path gain variables of overlapping 
paths are related by quadratic edge-compatibility conditions. If 
multiple copies of an edge are present in the transformed net- 
work, then the edge is part of multiple source-sink paths in the 
original network. Therefore, edge compatibility conditions are 
indicated by the presence of multiple edges in the transformed 
network. The edge functions in the transformed network can be
used to write down the edge compatibility conditions. We first
show this for the butterfly network example and later provide 
the general form. 

In our illustrative example of Fig. 2(b), the edge $e_3$ is copied
four times. Since there are $\binom{4}{2} = 6$ ways of choosing two
copies among the four, there will be six edge compatibility
conditions for $e_3$. The symbols on the copies of $e_3$ on the
trees with root nodes 7, 8, 9 and 10 are $a_2X_1 + b_1X_2$,
$a_4X_1 + b_2X_2$, $a_5X_1 + b_3X_2$ and $a_6X_1 + b_5X_2$, respectively. Hence,
in fractional form, we need $\frac{a_2}{a_4} = \frac{b_1}{b_2}$ (roots 7 and 8),
$\frac{a_2}{a_5} = \frac{b_1}{b_3}$ (roots 7 and 9), $\frac{a_2}{a_6} = \frac{b_1}{b_5}$ (roots 7 and 10),
$\frac{a_4}{a_5} = \frac{b_2}{b_3}$ (8 and 9), $\frac{a_4}{a_6} = \frac{b_2}{b_5}$ (8 and 10) and
$\frac{a_5}{a_6} = \frac{b_3}{b_5}$ (9 and 10).

In the degree-2 form, the edge compatibility conditions for
the four copies of the edge $e_3$ are listed below:

$$\begin{aligned}
a_2b_2 &= a_4b_1, & a_2b_3 &= a_5b_1,\\
a_2b_5 &= a_6b_1, & a_4b_3 &= a_5b_2,\\
a_4b_5 &= a_6b_2, & a_5b_5 &= a_6b_3.
\end{aligned} \tag{4}$$



For the butterfly network example, we do not get any other
edge compatibility conditions. For edges $e_5$ and $e_6$, the
equations are identical to the ones listed above. Also, there are no
equations for edges $e_1$, $e_2$, $e_4$ and $e_7$ since these edges have
scaled versions of the same symbol flowing through them.
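
The six conditions for the copies of the shared edge can be produced mechanically: two copies are compatible when their coefficient vectors are proportional, so every 2x2 minor must vanish. The following is a sketch under stated assumptions (variable names as strings, arithmetic over a prime field GF(p)); the helper names are illustrative.

```python
from itertools import combinations

def compatibility_conditions(copies):
    """Pairwise degree-2 conditions that make the coefficient vectors proportional.

    copies : dict {tree root: tuple of coefficient names, one per source}
    Returns ((x, y), (z, w)) pairs, each meaning x*y = z*w.
    """
    conds = []
    for (_, u), (_, w) in combinations(copies.items(), 2):
        for i, k in combinations(range(len(u)), 2):
            conds.append(((u[i], w[k]), (w[i], u[k])))   # u_i w_k = w_i u_k
    return conds

def holds(conds, values, p):
    """Check all conditions for a numeric assignment, modulo the prime p."""
    return all((values[x] * values[y] - values[z] * values[w]) % p == 0
               for (x, y), (z, w) in conds)

# Coefficients of (X1, X2) on the four copies of e3 in the butterfly example.
E3_COPIES = {7: ("a2", "b1"), 8: ("a4", "b2"), 9: ("a5", "b3"), 10: ("a6", "b5")}
CONDS = compatibility_conditions(E3_COPIES)

# One assignment over GF(2) that satisfies the linear conditions of the example.
SAMPLE = {"a2": 0, "b1": 0, "a4": 1, "b2": 1, "a5": 1, "b3": 1, "a6": 0, "b5": 0}
```

With more than two sources the same function emits one condition per pair of sources and pair of copies, which mirrors the general form stated later in this section.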

We have seen that not all duplicated edges result in distinct 
compatibility conditions. In general, edge compatibility equa- 
tions will be required for each edge e in the original network 
that satisfies the following conditions: 

1) Number of copies of $\mathrm{head}(e)$ in the transformed network
> 1 (or the edge will not be replicated at all).

2) Number of different source nodes having a path to
$e$ > 1 (since if two copies of $e$ carry $a_1X_1$ and $a_2X_1$,
these will be scalar multiples of each other for any value
assigned to $a_1, a_2$).

3) $|I(\mathrm{tail}(e))| > 1$ (or the equations will be the same as those
for $e' \in I(\mathrm{tail}(e))$).

We now state the general form of the edge-compatibility
conditions in terms of nodes of the transformed network.
Given a node $v \in V$ in the original network, the general form
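of the condition for two copies of $v$, denoted by $v_1$ and $v_2$ in
$V'$, belonging to the $j_1$-th and $j_2$-th trees, respectively, can be
written as follows:

$$\Bigl(\sum_{k \in h_{i_1 j_1}(v_1)} a_{i_1 j_1 k}\Bigr)\Bigl(\sum_{k \in h_{i_2 j_2}(v_2)} a_{i_2 j_2 k}\Bigr) = \Bigl(\sum_{k \in h_{i_1 j_2}(v_2)} a_{i_1 j_2 k}\Bigr)\Bigl(\sum_{k \in h_{i_2 j_1}(v_1)} a_{i_2 j_1 k}\Bigr) \tag{5}$$

where $h_{ij}(v)$ denotes the set of leaf nodes in the $j$-th tree that
are copies of the source node $s_i$ and have a path to $v$.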

A careful study of the general form shows that an edge
compatibility condition needs to be introduced for every two
copies, $v_1, v_2 \in V'$, of node $v \in V$ and for every two
sources $s_{i_1}, s_{i_2} \in S$ such that (a) $|I(v_1)| > 1$, (b) $v_1 \in V_{j_1}$,
$v_2 \in V_{j_2}$, where $V_{j_1}$ is the set of nodes in the $j_1$-th tree, and (c)
$h_{i_1 j_1}(v_1) \neq \emptyset$, $h_{i_2 j_1}(v_1) \neq \emptyset$.

The linear no-interference conditions and the quadratic edge 
compatibility conditions on the path gains are necessary and 
sufficient conditions for existence of solutions to the scalar 
linear network coding problem. The sufficiency is proved by
Algorithm 3 in Section V. Before describing the sufficiency,
we show how the linear and quadratic equations in path gains 
can be simplified in a systematic manner to provide useful 
results. 



F. Simplifying the equations 

The linear equations (No Interference conditions) possess 
the special property that each of them involves a mutually 
exclusive set of variables. Using this property, we can simplify 
the system of equations in the following two ways: 

1) It is possible that some of the variables never occur in the
non-linear equations (Edge Compatibility conditions).
From (4), we can see that $a_1$ is one such variable in
the example of the butterfly network. It can be easily
seen that the linear equation involving $a_1$ can be trivially
satisfied for any value assigned to the other variables
involved in the same linear equation by choosing an
appropriate value of $a_1$ (which does not have any other
condition on it). Hence, $a_1$ along with the linear equation
it occurs in can be removed from the system as trivially
solvable.

Therefore, the first simplification would involve elimi- 
nation of variables (and their corresponding linear equa- 
tions) that do not occur in any non-linear equation. 

2) Since each linear equation involves a mutually exclusive 
set of variables, we can eliminate one variable using each 
linear equation easily. Eliminating this variable from the 
non-linear equations (note that this does not increase the 
degree of the system) might reduce some of them to 
linear equations which can again be used to eliminate 
more variables iteratively. 

In the case of the butterfly network, after the first step of 
simplification, we are left with 8 variables, 4 linear equations 
and 6 non-linear equations. 
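
The first simplification step only needs to know which variables occur in which equations. A minimal sketch, assuming each equation is represented by the set of variable names it contains (a representation chosen here for illustration):

```python
def drop_free_variables(linear, nonlinear):
    """Step 1: drop every linear equation that contains a variable
    occurring in no non-linear equation.

    linear, nonlinear : lists of sets of variable names, one set per equation
    Returns the linear equations that are kept and the variables still in play.
    """
    used = set().union(*nonlinear) if nonlinear else set()
    kept = [eq for eq in linear if eq <= used]
    return kept, used

# Variable supports of the butterfly equations (2) and (4).
LINEAR = [{"a1", "a2"}, {"b1"}, {"a3", "a4"}, {"b2"},
          {"a5"}, {"b3", "b4"}, {"a6"}, {"b5", "b6"}]
NONLINEAR = [{"a2", "b2", "a4", "b1"}, {"a2", "b3", "a5", "b1"},
             {"a2", "b5", "a6", "b1"}, {"a4", "b3", "a5", "b2"},
             {"a4", "b5", "a6", "b2"}, {"a5", "b5", "a6", "b3"}]
KEPT, VARIABLES = drop_free_variables(LINEAR, NONLINEAR)
```

For the butterfly example this keeps the four single-variable linear equations and eight variables, matching the counts quoted above.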



In the second step of the simplification, after the first round
of elimination of variables using the linear equations (2) in
(4), we are left with 4 variables: $a_2$, $a_4$, $b_3$ and $b_5$ and the 6
equations as shown below.

$$\begin{aligned}
a_2 &= 0, & b_5 &= 0,\\
a_2b_5 &= 0, & a_2b_3 &= 0,\\
a_4b_5 &= 0, & a_4b_3 &= 1.
\end{aligned}$$

Subsequently, $a_2$ and $b_5$ can also be eliminated, using the
linear equations above, leaving just 2 variables and the relation:

$$a_4b_3 = 1. \tag{6}$$

Hence, the network coding problem for the example of the
butterfly network has been reduced to solving only one (non-
trivial) equation given in (6).
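
The reduction can be cross-checked by exhaustive search over a small prime field. The sketch below is an illustration, not part of the proposed method: it assumes GF(p) with p prime and simply tests every assignment of the twelve path gains against the linear conditions (2) and the quadratic conditions (4).

```python
from itertools import product

def is_solution(vals, p):
    """True if vals = (a1..a6, b1..b6) satisfies (2) and (4) modulo the prime p."""
    a1, a2, a3, a4, a5, a6, b1, b2, b3, b4, b5, b6 = vals
    residuals = (
        a1 + a2 - 1, b1, a3 + a4, b2 - 1,               # conditions (2)
        a5 - 1, b3 + b4, a6, b5 + b6 - 1,
        a2*b2 - a4*b1, a2*b3 - a5*b1, a2*b5 - a6*b1,    # conditions (4)
        a4*b3 - a5*b2, a4*b5 - a6*b2, a5*b5 - a6*b3,
    )
    return all(r % p == 0 for r in residuals)

def solutions(p):
    """All assignments over GF(p) that satisfy the full system (brute force)."""
    return [vals for vals in product(range(p), repeat=12) if is_solution(vals, p)]

SOLUTIONS_GF2 = solutions(2)
```

Over GF(2) the search returns a single assignment, and over GF(3) the surviving assignments are exactly those whose fourth a-variable and third b-variable multiply to 1, as the reduced equation predicts. The search is exponential in the number of variables and is meant only as a sanity check on small examples.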

IV. Illustrative Examples 

In this section, we provide a few examples to illustrate the 
usefulness of the path gain approach in deriving the system 
of polynomial equations corresponding to a network coding 
problem. Note that several problems in this area of polynomial 
equations and network coding are NP-hard or undecidable, and 
we do not expect polynomial-time algorithms and exact step- 
by-step solutions to result from the path gain approach. Our 
approach is to demonstrate the effectiveness of the path gain 
method in several examples of varying complexity. 

In all examples, we provide the number of equations and 
variables obtained from the edge-to-edge gain formulation. 
The path gain formulation (after simplifications) will result in 
better numbers in many cases. However, we point out that this 
is not a comparison of the two methods, since one is simplified 
and the other is not. As we have shown, the path gain method 
can be seen as a method for simplifying the edge-to-edge 
gain equations. Other methods for simplifying generic systems
of polynomial equations, such as Gröbner basis methods,
are useful in several networks. Also, Gröbner basis or other
methods can be used after the path-gain-based simplifications.
However, in many examples, we observe that the path gain 
formulation appears to provide results on solvability. This is 
mainly because the path gain approach provides low degree 
equations, which are amenable to easy analysis and further 
simplifications. 

A. Illustration of derivation from Koetter-Medard formulation 

For the butterfly network, the relationship between the path
gain variables (shown in Fig. 2(b)) and the edge-to-edge gains in
the Koetter-Medard formulation (Fig. 1) can be written down
as follows: $a_1 = \alpha_3$ (path: 1-5-7), $a_2 = \alpha_4\alpha_1$ (path:
1-3-4-5-7), $b_1 = \alpha_4\alpha_2$ (path: 2-3-4-5-7), $a_3 = \alpha_5$
(path: 1-5-8), $a_4 = \alpha_6\alpha_1$ (path: 1-3-4-5-8), $b_2 = \alpha_6\alpha_2$
(path: 2-3-4-5-8), $a_5 = \alpha_1\alpha_7$ (path: 1-3-4-6-9),
$b_3 = \alpha_2\alpha_7$ (path: 2-3-4-6-9), $b_4 = \alpha_8$ (path: 2-6-9),
$a_6 = \alpha_1\alpha_9$ (path: 1-3-4-6-10), $b_5 = \alpha_2\alpha_9$ (path:
2-3-4-6-10), $b_6 = \alpha_{10}$ (path: 2-6-10).

The no-interference conditions are easily obtained. For edge
compatibility between the path 2-3-4-5-8 and the path
1-3-4-6-9, we get the equation $b_2a_5 = b_3a_4 = \alpha_6\alpha_2\alpha_1\alpha_7$.
Other compatibility conditions can be checked similarly.

Fig. 3. (a) An example network that is solvable only over fields with characteristic 2. There are three sources (1, 2 and 3) producing symbols $X_1$, $X_2$ and $X_3$ respectively. There are three sinks (12, 13 and 14) demanding symbols $X_3$, $X_1$ and $X_2$ respectively. (b), (c), (d) The final transformed network with 3 trees, one for each sink node.

The change to path gain variables results in easy simplification
of the resulting equations with no increase in degree.
Finally, we obtain the simple equation $a_4 b_3 = 1$, which is not
obvious even when the substitution is clearly specified.
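As a quick sanity check on the substitution, the following minimal sketch draws edge-to-edge gains $\alpha_1, \ldots, \alpha_{10}$ at random from a prime field, forms the twelve path gains as products along each path, and confirms that the compatibility relation $b_2 a_5 = b_3 a_4$ holds for every draw. The field size `P = 7` and the helper names `path_gains` and `compatible` are illustrative assumptions, not part of the original formulation.

```python
import random

P = 7  # illustrative prime field size (assumption)

def path_gains(alpha, p=P):
    """Map edge-to-edge gains alpha[1..10] to the butterfly path gains."""
    a = {
        1: alpha[3],             # path 1-5-7
        2: alpha[4] * alpha[1],  # path 1-3-4-5-7
        3: alpha[5],             # path 1-5-8
        4: alpha[6] * alpha[1],  # path 1-3-4-5-8
        5: alpha[1] * alpha[7],  # path 1-3-4-6-9
        6: alpha[1] * alpha[9],  # path 1-3-4-6-10
    }
    b = {
        1: alpha[4] * alpha[2],  # path 2-3-4-5-7
        2: alpha[6] * alpha[2],  # path 2-3-4-5-8
        3: alpha[2] * alpha[7],  # path 2-3-4-6-9
        4: alpha[8],             # path 2-6-9
        5: alpha[2] * alpha[9],  # path 2-3-4-6-10
        6: alpha[10],            # path 2-6-10
    }
    return ({k: v % p for k, v in a.items()},
            {k: v % p for k, v in b.items()})

def compatible(alpha, p=P):
    """Check the edge compatibility relation b2*a5 = b3*a4 in GF(p)."""
    a, b = path_gains(alpha, p)
    return (b[2] * a[5] - b[3] * a[4]) % p == 0

rng = random.Random(0)
samples = [{i: rng.randrange(P) for i in range(1, 11)} for _ in range(100)]
all_ok = all(compatible(s) for s in samples)
print(all_ok)
```

The relation holds identically because both sides expand to the same product of four edge gains; the path gain formulation keeps it as a degree-2 constraint instead.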

B. Another Example 

Consider the network shown in Fig. 3(a), taken from [5], [6],
where it has been proved to have linear coding solutions only
over fields of characteristic 2. Nodes 1, 2 and 3 are sources
producing $X_1$, $X_2$ and $X_3$, respectively. Nodes 12, 13 and 14
are sinks demanding $X_3$, $X_1$ and $X_2$, respectively. The trees in
the equivalent transformed network are shown in Fig. 3(b), (c), (d).

The set of equations generated by the "No Interference
condition" are:

Node 12: $a_1 + a_2 = 0$; $b_1 + b_2 = 0$; $c_1 = 1$
Node 13: $a_3 = 1$; $b_3 + b_4 = 0$; $c_2 + c_3 = 0$
Node 14: $a_4 + a_5 = 0$; $b_5 + b_6 + b_7 = 1$; $c_4 + c_5 = 0$    (7)

The set of equations generated by the "Edge Compatibility
condition" for edges $e_1$, $e_2$, $e_3$ and $e_4$ respectively are:

$e_1$: $a_2 b_3 = a_3 b_1$; $a_2 b_5 = a_4 b_1$; $a_2 b_7 = a_5 b_1$;
       $a_3 b_5 = a_4 b_3$; $a_3 b_7 = a_5 b_3$; $a_4 b_7 = a_5 b_5$

$e_2$: $a_2 (b_5 + b_6) = a_4 (b_1 + b_2)$; $a_2 c_4 = a_4 c_1$;
       $(b_1 + b_2) c_4 = (b_5 + b_6) c_1$

$e_3$: $a_3 b_7 = a_5 b_3$; $a_3 c_5 = a_5 c_2$; $b_3 c_5 = b_7 c_2$

$e_4$: $b_2 c_3 = b_4 c_1$; $b_2 c_4 = b_6 c_1$; $b_4 c_4 = b_6 c_3$    (8)

Using the linear equations to eliminate variables iteratively,
we get 9 equations in 6 variables shown below.

$a_2 b_3 = b_1$; $a_2 = -a_4 b_1$; $a_4 b_3 = -1$;

$a_2 c_4 = a_4$; $c_4 = a_4 c_2$; $b_3 c_4 + c_2 = 0$;

$b_1 c_2 + b_3 = 0$; $b_1 c_4 + 1 = 0$; $b_3 c_4 = c_2$    (9)

From the equations $b_3 c_4 + c_2 = 0$ and $b_3 c_4 = c_2$, we can derive
the relation $2 c_2 = 0$. Substituting $c_2 = 0$ in the above system
gives $c_4 = a_4 c_2 = 0$, and then $b_1 c_4 + 1 = 0$ reduces to the
condition $1 = 0$, which is not possible. Hence, we
must have $2 = 0$, which implies that the system is not solvable
in any field with an odd characteristic. Also, in characteristic 2,
setting all variables to 1 in the above equations is seen to be a
solution. This example demonstrates that, in practice, working
with the equations derived through the path gain formulation
can be advantageous.
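The argument can be cross-checked by exhaustive search over small prime fields. The sketch below is only an illustration: it encodes the nine equations of (9) as residuals (left-hand side minus right-hand side) in the six remaining variables, and the function names and the choice of primes are assumptions. It covers prime fields GF(p) only, not extension fields.

```python
from itertools import product

def residuals(a2, a4, b1, b3, c2, c4):
    """Left-hand side minus right-hand side of each equation in (9)."""
    return [
        a2 * b3 - b1,   # a2*b3 = b1
        a2 + a4 * b1,   # a2 = -a4*b1
        a4 * b3 + 1,    # a4*b3 = -1
        a2 * c4 - a4,   # a2*c4 = a4
        c4 - a4 * c2,   # c4 = a4*c2
        b3 * c4 + c2,   # b3*c4 + c2 = 0
        b1 * c2 + b3,   # b1*c2 + b3 = 0
        b1 * c4 + 1,    # b1*c4 + 1 = 0
        b3 * c4 - c2,   # b3*c4 = c2
    ]

def solutions_mod_p(p):
    """All assignments in GF(p)^6 that satisfy every equation of (9)."""
    return [v for v in product(range(p), repeat=6)
            if all(r % p == 0 for r in residuals(*v))]

for p in (2, 3, 5):
    print(p, len(solutions_mod_p(p)))
```

Under this encoding the search finds the all-ones assignment for p = 2 and no assignment for p = 3 or p = 5, in agreement with the derivation of $2 = 0$.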

For this example, the Koetter-Medard formulation, as illustrated
in [6], results in 17 equations in 22 variables. However,
as shown in [6], it is possible to derive $2 = 0$ from these 17
equations using other simplifications. Alternatively, a Gröbner
basis method can also be used to derive $2 = 0$. The path
gain approach should be seen as a generic technique for
simplification that can be used in arbitrary network coding
problems, as shown in the next two examples.



C. Multicast example 

An interesting example of a multicast problem, presented
in [15], is shown in Fig. 4. The sources are nodes 1 and 2,
and the sinks are nodes 11-16. This problem does not have a
binary solution, as shown in [15].

Fig. 4. Multicast example.

Using the edge-to-edge gain formulation, we get 24 equations
in 32 variables. The path gain method initially results
in 84 equations in 48 variables. After the simplifications, we
obtain 54 equations in 18 variables. Significantly, there are 6
quadratic equations, each of the form $x_1^2 + x_2^2 + x_1 x_2 = 0$. Next,
we can show that $x_1 x_2 = 0$ (either $x_1 = 0$ or $x_2 = 0$) provides
a contradiction in the equations. Hence, we have equations of
the form $x^2 + x + 1 = 0$, which cannot be solved in the binary
field. With some more analysis, we can find solutions over
GF(4).
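To make the last step concrete, the following sketch searches for roots of $x^2 + x + 1$ in GF(2) and in GF(4). The representation is an assumption made for illustration: elements of GF(4) are encoded as the integers 0 to 3 (bit patterns of polynomials in $\alpha$ with $\alpha^2 = \alpha + 1$), addition is XOR, and `gf4_mul` and `roots` are hypothetical helper names.

```python
def gf4_mul(x, y):
    """Multiply two GF(4) elements encoded as 2-bit integers."""
    r = 0
    for i in range(2):
        if (y >> i) & 1:
            r ^= x << i          # carry-less (polynomial) multiplication
    if r & 0b100:
        r ^= 0b111               # reduce with alpha^2 = alpha + 1
    return r

def roots(elements):
    """Elements x with x^2 + x + 1 = 0 (addition is XOR)."""
    return [x for x in elements if (gf4_mul(x, x) ^ x ^ 1) == 0]

gf2 = [0, 1]          # GF(2) sits inside GF(4) as {0, 1}
gf4 = [0, 1, 2, 3]    # 2 encodes alpha, 3 encodes alpha + 1 = alpha^2

print(roots(gf2))
print(roots(gf4))
```

In this encoding the polynomial has no root among {0, 1}, while both $\alpha$ and $\alpha^2$ are roots in GF(4).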

From this example, we see that the path gain formulation
provides useful simplifications in non-trivial cases. In contrast,
Gröbner basis methods on the edge-to-edge gain equations are
not immediately useful in showing linear insolvability over
GF(2). Note that this does not rule out any other simplification
of the edge-to-edge equations to obtain the necessary result.
We merely conclude that the path gain method provides a
useful simplification.

D. A Bigger Example 

Consider the ISP network topology shown in Fig. 5, taken
from [16].

The network has 87 nodes and 161 edges. Edges are directed 
from lower-numbered nodes to higher-numbered nodes i.e. 
in an edge (u,v), u < v. Hence, the graph is directed and 
acyclic. We assume all links have unit capacity, and use fields 
of characteristic 2 in our simplification steps. After directing 
the graph, the five nodes 1, 12, 21, 51 and 64 were set as 
sources in the example problems. Sink nodes and demands 
were chosen at random from among the sources visible from 
each sink. The graph is not reduced by this choice of demands, 
since all nodes are visible from the five chosen sources. 
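The orientation rule and the notion of a source being visible from a sink can be sketched as follows. This is a minimal illustration on a made-up four-node graph, not the ISP topology itself; the helper names `orient`, `is_topological` and `reachable` are assumptions.

```python
def orient(undirected_edges):
    """Direct every edge from its lower-numbered to its higher-numbered node."""
    return sorted({(min(u, v), max(u, v)) for u, v in undirected_edges if u != v})

def is_topological(order, edges):
    """True if every directed edge goes forward in the given node order."""
    pos = {v: i for i, v in enumerate(order)}
    return all(pos[u] < pos[v] for u, v in edges)

def reachable(source, edges):
    """Nodes visible from `source` along directed edges."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

toy = [(3, 1), (1, 2), (2, 3), (4, 2)]   # hypothetical undirected links
dag = orient(toy)
print(dag)
print(is_topological([1, 2, 3, 4], dag))
print(reachable(2, dag))
```

Because every edge satisfies u < v, the natural node numbering is itself a topological order, which is why the directed graph is acyclic.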

1) 5 sources (all rate 1), 10 sinks (all rate 1). The edge-to- 
edge gain formulation gives a system of 44 equations in 
30 variables. The path gain formulation initially results 
in 44 linear equations and 3 degree-2 equations in 316 
variables. After applying the simplification steps, we 
are left with only 3 degree-2 equations in 7 variables 
assuming a solution exists in a characteristic 2 field. In
fact, setting all the remaining 7 variables to zero results 
in a valid solution to the three equations (some other 
scaling variables are non-zero). Hence, a solution over 
GF(2) is possible. 

2) 5 sources (one with rate 2, others rate 1), 11 sinks 
(all rate 1). The edge-to-edge gain formulation yields 
a system of 50 equations in 40 variables in this case. In 
comparison, the path gain formulation initially resulted 
in 50 linear equations and 34 degree-2 equations in 330 
variables. But after applying the simplification steps, we 
are left with only 13 degree-2 equations in 17 variables 
assuming a solution exists in a characteristic 2 field.
Again the all-zero solution is valid for the remaining 
17 variables resulting in a network code over GF(2). 

3) 5 sources (all with rate 2), 8 sinks (all rate 1). The 
edge-to-edge gain formulation yields a system of 88 
equations in 180 variables. The path gain formulation 
initially gives 88 linear and 11198 degree-2 equations in 
632 variables. But on applying the simplification steps, 
assuming characteristic 2, it turns out that the system is 
not solvable over characteristic 2. 

To run further examples, we computed the max-flows from
the sources (1, 12, 21, 51, 64) to a set of 11 nodes chosen as
sinks. Rates below the individual max-flows were assigned to
sources and sink demands. The results are summarized below.
(Notation: $S$, $T$ are the source and sink sets. A source node $s$ of
rate $R > 1$ is shown as $R$ source nodes $s_1, s_2, \ldots, s_R$. The
demands of each sink are shown in brackets.)

1) $S = \{1_1, 1_2, 12_1, 12_2, 21, 51, 64\}$, $T = \{15(12_1, 12_2),
40(1_1, 12_1, 21), 43(21), 49(1_1), 62(1_1, 12_1, 21), 63(12_1,
12_2), 67(1_1), 71(21), 82(64), 83(21), 86(21)\}$. The path
gain formulation yields 1188 equations in 507 variables.
After simplifications, there are 476 equations, but none
of them has a constant term. Hence, setting the
remaining variables to zero provides a binary solution.

2) $S = \{1_1, 1_2, 12_1, 12_2, 21, 51, 64\}$, $T = \{15(12_1, 12_2),
40(1_1, 1_2, 12_1, 21), 43(21), 49(1_1), 62(1_1, 12_1, 12_2, 21),
63(12_1, 12_2), 67(1_1), 71(21), 82(64), 83(21), 86(21)\}$.
We obtain 555 variables and 1683 equations. Upon
simplification, we find that a solution does not exist in
characteristic 2.

3) $S = \{1, 12, 21_1, 21_2, 51, 64\}$, $T = \{15(1), 40(21_1,
21_2), 43(21_1), 49(1), 63(12), 67(1), 71(21_1), 82(64),
83(21_1), 86(21_1)\}$. The path gain formulation yields 578
variables and 12048 equations. After simplifications,
there are 6780 equations, but only five of them have a
constant term. A linear term (one path gain) appears in
each of these five equations, but does not appear as a
linear term in any other equation. Hence, setting this
one path gain to 1 and the remaining variables to zero
provides a binary solution.

As expected, solutions are not guaranteed even if all demands 
are within individual max flows. We see that the number 
of equations and variables increases steeply in some cases. 
However, guessing a binary solution may be feasible. 
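The guessing strategy used in cases 1) and 3) can be sketched on a toy system. The representation below is an assumption made for illustration: a polynomial over GF(2) is a set of monomials, each monomial is a frozenset of variable names (the empty frozenset is the constant 1), and repeated variables are collapsed, which is valid only for values in GF(2). The two toy systems and the names `eval_gf2` and `satisfies` are hypothetical.

```python
def eval_gf2(poly, ones):
    """Evaluate a GF(2) polynomial when exactly the variables in `ones` are 1."""
    return sum(1 for monomial in poly if monomial <= ones) % 2

def satisfies(system, ones):
    """True if every polynomial in the system evaluates to 0."""
    return all(eval_gf2(poly, ones) == 0 for poly in system)

m = frozenset  # shorthand for building monomials

# Toy system with no constant terms: the all-zero guess works.
no_const = [
    {m({"x", "y"}), m({"z"})},
    {m({"x"}), m({"y", "z"})},
]

# Toy system where each equation with a constant term also contains the
# linear term u, and u is not a linear term anywhere else.
with_const = [
    {m({"u"}), m()},
    {m({"u"}), m({"x", "y"}), m()},
    {m({"x", "z"}), m({"y"})},
]

print(satisfies(no_const, frozenset()))
print(satisfies(with_const, frozenset()))
print(satisfies(with_const, frozenset({"u"})))
```

The all-zero guess fails exactly when some equation has a constant term; setting the shared linear variable to 1 then cancels the constants, mirroring the argument in case 3).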

To obtain another example, we modified the graph
of Fig. 5 (by changing edge connections) to get a
butterfly network as a subgraph when the nodes 62
and 63 demand the sources 1 and 12. On the modified
graph, we set $S = \{1, 12_1, 12_2, 21, 51, 64\}$, $T = \{15(12_1, 12_2),
40(1, 12_1, 21), 43(21), 49(1), 62(1, 12_1, 21), 63(1, 12_1),
67(1), 71(21), 82(64), 83(21), 86(21)\}$, where the other
demands are chosen to be below the individual max flows. We
obtain 1247 equations in 503 variables. After simplifications,
there are no equations with a constant term. So, setting the
remaining variables to zero results in a binary solution.

Hence, we see that the path-based formulation of scalar 
linear network coding appears to yield useful results even over 
large networks with a few sources and sinks. This shows the 
extent of simplification possible in polynomial systems defined 
by network coding problems. 

E. Simplification summary 

It has been shown in [17] that the complexity of Gröbner
basis algorithms depends, among other things, on the maximum
degree of the starting basis. The degrees of the intermediate
polynomials computed during Gröbner basis calculations
have been shown to grow up to $2^{2^d}$ if the maximum degree of
the starting basis is $d$. Due to these issues, Gröbner basis
algorithms become practically intractable except for small
problem instances. In the light of these results, the path
gain formulation that produces degree-2 equations becomes
important in reducing the running complexity of Gröbner basis
algorithms that may be used to solve the network coding
problem.

Fig. 5. An ISP network over Europe with 87 nodes and 161 edges.

The simplification provided by the path gain approach is
summarized in Table I for the various examples presented so
far. As indicated, we have shown numbers for only one round
of linear equation simplification. It can be seen that, apart from
having a maximum degree of only 2, the number of variables
is also smaller in many cases, enabling the use of methods such as
[18] for solving the system.

TABLE I
Comparison of formulations.

                      Path gain                  Edge gains (unsimplified)
Example          Var.^1   Deg. 2 Eqns^1        Var.    Eqns    Deg.
Butterfly           4          6                10       8       2
Fig. [3k            8         15                14       9       3
[3, Fig. 5]         9          5                14       4       4
[5, Fig. 3]        27         45                50      32       3
[6, Fig. 3]        12         30                22      17       3

^1 After one iteration of elimination using the linear equations

The number of variables and equations from the edge-to-edge
gain formulation are of the order of the number of edges
in the network. However, the number of monomial terms possible
using the edge-to-edge gain variables is exponential in the
number of edges. In a large network, depending on the number
of paths from sources to sinks, a large number of monomial
terms occur in the system of polynomial equations. Because
of the large number of variables and larger number of terms,
there is no obvious method to simplify the equations other
than running standard routines for Gröbner basis. The path
gain approach is beneficial in providing results on solvability
in some examples and in reducing the complexity of Gröbner
basis methods in most cases.

V. Network Code from Path Gain Variables 

While the path gain variables are useful for solving the sys- 
tem of polynomial equations, the implementation of network 
coding is through edge-to-edge gains. We now describe an 
algorithm to obtain a network code for the original network 
from the path gain variables in the transformed network. 
Note that this completes the proof of the sufficiency of edge 
compatibility conditions. 

First, we will briefly describe the algorithm and then present 
a notational version of the same. A solution to the system of 
polynomial equations in the path gain formulation consists 
of a set of values assigned to the path gain variables at the 
leaf source nodes in the transformed network such that the 
no interference conditions as well as the edge compatibility 
conditions are satisfied. The algorithm to construct a network 
code from such a solution consists of propagating the values 
of these coefficients from the source nodes to the sink nodes 
through the transformed network. 

We compute two vectors for every edge $e$ of the graph
$G = (V, E)$. The first vector $\mathbf{f}_e = [f_e(1)\ f_e(2)\ \cdots\ f_e(|S|)]$
represents the edge function or symbol $\sum_{j=1}^{|S|} f_e(j) X_j$ sent
over edge $e$. Suppose $e$ is replicated $n$ times to obtain edges
$e''_i$, $1 \le i \le n$, in the transformed graph $G' = (V', E')$. The
second vector $\mathbf{c}_e = [c_1\ c_2\ \cdots\ c_n]$ is such that the edge
function on $e''_i \in E'$ is $c_i \sum_{j=1}^{|S|} f_e(j) X_j$. Note that such a
scaling property is guaranteed for all copies of an edge by the
compatibility conditions. Once the vectors $\mathbf{f}_e$ are computed for
all $e \in E$, the network code in $G$ is completely known.

Suppose $\mathbf{f}_{e'}$ and $\mathbf{c}_{e'}$ are known for all the incoming edges
$e' \in I(v)$ for a node $v \in V$. The vectors $\mathbf{f}_e$ and $\mathbf{c}_e$ can be
computed for the outgoing edges $e \in O(v)$ as illustrated for
a sample case in Fig. 6. In the figure, a node $v \in V$ with
$I(v) = \{e_1, e_2, e_3\}$ and $O(v) = \{e_4, e_5\}$ is replicated thrice
into $v(1)$, $v(2)$ and $v(3)$ in $G'$. The incoming and outgoing
links are assumed to be replicated as shown in the transformed
network. For instance, the edge $e_1$ is replicated thrice as $e_1(1)$,
$e_1(2)$ and $e_1(3)$. We suppose that there are three source nodes
$S = \{s_1, s_2, s_3\}$, and $\mathbf{f}_{e_i} = [a_{i1}\ a_{i2}\ a_{i3}]$. This results in edge
functions $A_i = \sum_{j=1}^{3} a_{ij} X_j$ for $i = 1, 2, 3$. We assume that
the scaling vectors $\mathbf{c}_{e_i}$ are as shown in the figure.

Using the edge functions and scaling factors on the incoming
edges, the edge functions of the copies of $e_i$, $i = 1, 2, 3$,
are computed first. For instance, the edge function of $e_2(2)$ is
computed as $b_2 A_2$. Then, the edge functions for the outgoing
links of $v(1)$, $v(2)$ and $v(3)$ in $G'$ are computed by simple
addition. As shown in the figure, the symbols sent on $e_4(1)$
and $e_4(2)$ will be scalar multiples. We then assign the symbol
on $e_4$ in $G$ to be the symbol on $e_4(1)$ given by
$\sum_{i=1}^{3} a_i A_i = \sum_{j=1}^{3} \left(\sum_{i=1}^{3} a_i a_{ij}\right) X_j$
(assumed nonzero). Then, $\mathbf{f}_{e_4}$ and $\mathbf{c}_{e_4}$
are assigned suitably.

In this manner, all the nodes are processed in a suitable order 
to compute the network code for the original graph from the 
path gains on the transformed graph. We now introduce some 
notation to describe the algorithm formally. 

A. Notation 

Consider the given network $G = (V, E)$ and the equivalent
transformed network $G' = (V', E')$. Then, for each node
$v \in V$, let us define the set of network coding coefficients
as $\alpha_{e',e}\ \forall\ e' \in I(v), e \in O(v)$, i.e., if $x_{e'}$ is the symbol
received on the link $e' \in I(v)$, the symbol sent on $e \in O(v)$
is $\sum_{e' \in I(v)} \alpha_{e',e} x_{e'}$.

Nodes and edges get replicated during the transformation
from $G$ to $G'$. We define some sets to hold information about
the replicated nodes and edges. For $v \in V$ ($v \notin S \cup T$) and
$e, e' \in E$, define:

$R_v = \{v' \in V' : v' \text{ is a copy of } v\}$
$R_e = \{e'' \in E' : e'' \text{ is a copy of } e\}$
$R_{e',e} = \{e'' \in R_{e'} : \exists\, e''' \in R_e \text{ so that head}(e'') = \text{tail}(e''')\}$

The sets $R_v$ and $R_e$ hold nodes and edges in $G'$ that are copies
of $v$ and $e$, respectively. Two such useful sets are (1) $R_{\text{head}(e)}$
that contains copies of head($e$), and (2) $R_{\text{tail}(e)}$ that contains
copies of tail($e$). The set $R_{e',e}$ contains copies of an edge $e'$
that connect to a copy of $e$. Clearly, $R_{e',e}$ is non-empty only
when $e' \in I(v)$ and $e \in O(v)$ for some node $v$.

For each $e \in I(v)$ in the original graph $G$, there is a one-to-one
correspondence between $R_e = \{e_1, e_2, \ldots, e_{|R_e|}\}$ and
$R_v = \{v_1, v_2, \ldots, v_{|R_v|}\}$ given by $v_i = \text{head}(e_i)$ in the
transformed graph $G'$. Thus, we have the equality $|R_e| = |R_{\text{head}(e)}|$.
This is because the incoming edges are duplicated every time
a node is duplicated. So, for $e, e' \in I(v)$ (two incoming
edges of one node), we will have $|R_e| = |R_{e'}|$ and the sets
$R_e = \{e_1, e_2, \ldots, e_{|R_e|}\}$ and $R_{e'} = \{e'_1, e'_2, \ldots, e'_{|R_{e'}|}\}$ will
be ordered such that $v_i = \text{head}(e_i) = \text{head}(e'_i)$.

For each $e \in O(v)$, define the set $R_{v,e} = \{\text{tail}(e') : e' \in
R_e\}$ to be the subset of $R_v$ that contains nodes whose outgoing
edge is a copy of $e$ in $G'$. Note that $|R_{v,e}| = |R_e|$.

Let the vector $\mathbf{f}_e = [f_e(1)\ f_e(2)\ \cdots\ f_e(|S|)]$ represent the
edge function $\sum_{j=1}^{|S|} f_e(j) X_j$ sent over edge $e \in E$ in the
final linear network code in $G$. Since the edge compatibility
conditions are satisfied, the edge function on each copy of $e$
in $R_e$ will be a scalar multiple of $\mathbf{f}_e$. For $e'' \in R_e$, let the edge
function on $e''$ be $\mathbf{f}_{e''} = c_e(e'') \mathbf{f}_e$. We collect the multiplying
factors $c_e(e'')$, $e'' \in R_e$, into a vector $\mathbf{c}_e = [c_e(e'') : e'' \in R_e]$.
The correspondence between $R_e$ and $R_{\text{head}(e)}$ results in a one-to-one
correspondence between elements of the sets $R_{\text{head}(e)}$
and $\mathbf{c}_e$ given by $c_e(e'') \leftrightarrow \text{head}(e'')$ for $e'' \in R_e$.

We define sub-vectors $\mathbf{c}_{e',e} = [c_{e'}(e'') : e'' \in R_{e',e}]$
collecting the multiplying factors on copies of $e'$ that connect
to copies of $e$. For a fixed $e \in O(v)$ and $e', e'' \in I(v)$
with $R_{v,e} = \{v_1, v_2, \ldots, v_{|R_{v,e}|}\}$, the sets
$R_{e',e} = \{e'_1, e'_2, \ldots\}$ and $R_{e'',e} = \{e''_1, e''_2, \ldots\}$
will be in one-to-one correspondence and ordered so that
$\text{head}(e'_i) = \text{head}(e''_i) = v_i$. So, we have
$|R_{e',e}| = |R_{e'',e}| = |R_{v,e}| = |R_e|$.

In Fig. 6, for instance, we have $R_{e_1} = \{e_1(1), e_1(2), e_1(3)\}$,
$R_{e_4} = \{e_4(1), e_4(2)\}$ and $R_{e_5} = \{e_5(1)\}$. Also, $R_{e_1,e_4} =
\{e_1(1), e_1(2)\}$ and $R_{e_1,e_5} = \{e_1(3)\}$. Similarly, $R_{e_2,e_4} =
\{e_2(1), e_2(2)\}$ and $R_{e_2,e_5} = \{e_2(3)\}$. The scaling vector
$\mathbf{c}_{e_1} = [a_1\ b_1\ c_1]$ with $\mathbf{c}_{e_1,e_4} = [a_1\ b_1]$ and $\mathbf{c}_{e_1,e_5} = [c_1]$.
Similarly, $\mathbf{c}_{e_2} = [a_2\ b_2\ c_2]$ with $\mathbf{c}_{e_2,e_4} = [a_2\ b_2]$ and
$\mathbf{c}_{e_2,e_5} = [c_2]$. Note that all one-to-one correspondences are
being preserved in the ordering of coordinates in the scaling
vectors.

Flow matrices at a vertex: For a vertex $v$, incoming edge
$e' \in I(v)$ and outgoing edge $e \in O(v)$, a rank-one flow matrix
$F_{e',e}$ is defined as $F_{e',e} = \mathbf{c}_{e',e}^T \mathbf{f}_{e'}$. The matrix $F_{e',e}$ is of
dimension $|R_{e',e}| \times |S|$, and the $(i,j)$-th element
$F_{e',e}(i,j) = c_{e'}(e''_i) f_{e'}(j)$ (letting $R_{e',e} = \{e''_1, e''_2, \ldots, e''_{|R_{e',e}|}\}$)
is the coding coefficient of the $j$-th source symbol flowing in the
$i$-th copy of edge $e'$ in $R_{e',e}$. We readily see that each row of
$F_{e',e}$ is the coding vector in a copy of edge $e'$ in $G'$.

In Fig. 6, for instance, we have
$F_{e_1,e_4} = [a_1\ b_1]^T [a_{11}\ a_{12}\ a_{13}]$ and
$F_{e_1,e_5} = [c_1] [a_{11}\ a_{12}\ a_{13}]$. In terms
of path gain variables, $F_{e',e}(i,j)$ is equal to the sum of the
path gain variables for all paths starting from (some copy of)
the $j$-th source and using the $i$-th copy of edge $e'$ in $R_{e',e}$.

Let $I(v) = \{e_1, e_2, \ldots, e_d\}$. For $e \in O(v)$, let
$|R_{e_k,e}| = |R_e| = D$ (for all $k$), and let
$R_{e_k,e} = \{e'_{k1}, e'_{k2}, \ldots, e'_{kD}\}$ with
$\text{head}(e'_{kl}) = v'_l \in R_{v,e}$ independent of $k$. The $l$-th row of the
flow matrix $F_{e_k,e}$ contains the flow in the edge $e'_{kl}$ incident
on the node $v'_l$ for $1 \le k \le d$. Therefore, the sum
$F_e = \sum_{k=1}^{d} F_{e_k,e}$ is a $D \times |S|$ matrix whose $l$-th row is equal to the
sum of all incoming flows into node $v'_l$. By flow conservation,
the outgoing flow on the single outgoing edge from node $v'_l$
is equal to the $l$-th row of $F_e$ for $1 \le l \le D = |R_e|$. So, the
rows of $F_e$ contain the flows in the $D$ copies of the edge $e$
in $G'$, and the edge compatibility condition ensures that the
matrix $F_e$ is a rank-one matrix.

Fig. 6. Determining the vectors f and c for outgoing links.

B. The Algorithm 

The vectors $\mathbf{f}_e$ and $\mathbf{c}_e$ are initialized for an outgoing link
$e$ from the source node as follows. For the $i$-th source node
$s_i \in S$ and $e \in O(s_i)$, $\mathbf{f}_e = [0^{i-1}\ 1\ 0^{|S|-i}]$. For $e'' \in R_e$, the
coordinate $c_e(e'')$ of $\mathbf{c}_e$ is equal to the value of the scaling
variable at the source leaf node $\text{tail}(e'') \in R_{s_i}$.

Algorithm 3: Deriving the Network Code
Input: A directed acyclic network $G = (V, E)$, an equivalent
transformed network $G' = (V', E')$, a topological ordering
of nodes $P$ (from Algorithm 1) and a solution to the derived
system of polynomial equations.

For each node $v$ in the reverse topological ordering, $P'$, of
$P$, if $v \notin S \cup T$, do

1) Get $\mathbf{f}_{e'}, \mathbf{c}_{e'}$ from tail($e'$) $\forall\ e' \in I(v)$.

2) For each edge $e \in O(v)$

   a) Get $\mathbf{c}_{e',e}$ from $\mathbf{c}_{e'}$ as defined above $\forall\ e' \in I(v)$.

   b) $F_{e',e} \leftarrow \mathbf{c}_{e',e}^T \mathbf{f}_{e'}$ $\forall\ e' \in I(v)$.

   c) $F_e \leftarrow \sum_{e' \in I(v)} F_{e',e}$. $F_e$ is a matrix such that each
      row corresponds to the symbol flowing through a
      copy of edge $e$ in $G'$.

   d) $\mathbf{f}_e \leftarrow$ any non-zero row (say, $i$) of $F_e$, or the zero
      row if $F_e$ is the zero matrix. This is the symbol
      that will actually flow through $e$ in $G$.

   e) $\alpha_{e',e} \leftarrow c_{e',e}(i)$ $\forall\ e' \in I(v)$, where $i$ is the row
      selected in the previous step. This is the set of network
      coding coefficients of node $v$ corresponding
      to output link $e$.

   f) $c_e(j) \leftarrow$ ($j$-th row of $F_e$)$/\mathbf{f}_e$, or 0 if $\mathbf{f}_e = 0$, $\forall\ j =
      1, \ldots, |\mathbf{c}_e|$.

The decoding coefficients at a sink node $t_j$ are given by the
set $\{\mathbf{c}_e : e \in I(t_j)\}$. Note that all the matrices in this set have
only one element since there is only one copy of each sink
node (and all its input links) in $G'$.
Output: The set of all network coding coefficients, $\alpha_{e',e}$, for
the given network.

In the above algorithm, nodes are traversed in the reverse of
the topological order obtained from Algorithm 1. At a node $v$,
the vectors $\mathbf{f}_e$ and $\mathbf{c}_e$ are computed for $e \in O(v)$ using $\mathbf{f}_{e'}$ and
$\mathbf{c}_{e'}$ for $e' \in I(v)$. The reverse topological order ensures that
$\mathbf{f}_{e'}$ and $\mathbf{c}_{e'}$ are known for $e' \in I(v)$ before node $v$ is visited.

C. An Example 

We will now present an example of this algorithm applied to 
a sample solution for the modified butterfly network (Fig. El. 
Consider the following solution for the system over GF(4) =
$\{0, 1, \alpha, \alpha^2\}$, $\alpha^2 = 1 + \alpha$:

$a_1 = a_5 = b_2 = b_6 = 1$
$a_2 = a_6 = b_1 = b_5 = 0$
$a_3 = a_4 = \alpha$, $b_3 = b_4 = \alpha^2$

One reverse topological order of nodes is 1-2-3-4-5-6-7-8-9-10.
Nodes 1, 2 are source nodes. So, we have
$\mathbf{f}_{e_1} = \mathbf{f}_{e_4} = [1\ 0]$, $\mathbf{f}_{e_2} = \mathbf{f}_{e_5} = [0\ 1]$ and from the solution
above, we have $\mathbf{c}_{e_1} = [0\ \alpha\ 1\ 0]$, $\mathbf{c}_{e_4} = [1\ \alpha]$,
$\mathbf{c}_{e_2} = [0\ 1\ \alpha^2\ 0]$, $\mathbf{c}_{e_5} = [\alpha^2\ 1]$.

Then, beginning with the iteration for the non-source non-sink
nodes as described in the algorithm above, we will first
process node 3; we know $\mathbf{f}_e$, $\mathbf{c}_e$ for both its input links, $e_1$, $e_2$.
There are 4 copies of this node as shown in Fig. 2 and all 4
copies have copies of edge $e_3$ as their only output links. Hence,
$\mathbf{c}_{e_1,e_3} = \mathbf{c}_{e_1}$ and $\mathbf{c}_{e_2,e_3} = \mathbf{c}_{e_2}$. After computing $F_{e_1,e_3}$, $F_{e_2,e_3}$,
we arrive at:

$F_{e_3} = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \\ 1 & \alpha^2 \\ 0 & 0 \end{bmatrix}$

Now, let $\mathbf{f}_{e_3} = [\alpha\ 1]$, the second row of $F_{e_3}$. Hence, we
have $\alpha_{e_1,e_3} = \alpha$, $\alpha_{e_2,e_3} = 1$, the network coding coefficients
at node 3. Also, $\mathbf{c}_{e_3} = [0\ 1\ \alpha^2\ 0]$.

Next, we will move to node 4 which has two output links,
$e_6$, $e_7$. Hence, we have $\mathbf{c}_{e_3,e_6} = [0\ 1]$, $\mathbf{c}_{e_3,e_7} = [\alpha^2\ 0]$. We can
then compute $F_{e_3,e_6}$, $F_{e_3,e_7}$ and then arrive at:

$F_{e_6} = \begin{bmatrix} 0 & 0 \\ \alpha & 1 \end{bmatrix}$, $F_{e_7} = \begin{bmatrix} 1 & \alpha^2 \\ 0 & 0 \end{bmatrix}$

Now we can choose $\mathbf{f}_{e_6} = [\alpha\ 1]$, the second row of $F_{e_6}$, and
$\mathbf{f}_{e_7} = [1\ \alpha^2]$, the first row of $F_{e_7}$. Then, the network coding
coefficients for node 4 are $\alpha_{e_3,e_6} = 1$, $\alpha_{e_3,e_7} = \alpha^2$. Completing
the last step of the iteration, we get $\mathbf{c}_{e_6} = [0\ 1]$, $\mathbf{c}_{e_7} = [1\ 0]$.

Now we come to node 5. For the output link $e_8$, we have
$\mathbf{c}_{e_4,e_8} = [1]$, $\mathbf{c}_{e_6,e_8} = [0]$ and for $e_9$, we have $\mathbf{c}_{e_4,e_9} =
[\alpha]$, $\mathbf{c}_{e_6,e_9} = [1]$. Then we get:

$F_{e_8} = [1\ 0]$, $F_{e_9} = [0\ 1]$

So, $\mathbf{f}_{e_8} = [1\ 0]$ and $\mathbf{f}_{e_9} = [0\ 1]$, the only rows of the
respective matrices. Then, the network coding coefficients for
node 5 are:

$\alpha_{e_4,e_8} = 1$, $\alpha_{e_6,e_8} = 0$, $\alpha_{e_4,e_9} = \alpha$, $\alpha_{e_6,e_9} = 1$

Also, $\mathbf{c}_{e_8} = \mathbf{c}_{e_9} = [1]$.

Similarly, the network coding coefficients for node 6 can
also be computed so that sinks 9, 10 receive the required
symbols.
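The per-edge step of the algorithm can be reproduced on this example with a short sketch. It is an illustration under stated assumptions rather than a full implementation: GF(4) elements are encoded as the integers 0 to 3 (2 for $\alpha$, 3 for $\alpha^2$, addition as XOR), the first non-zero row of the flow matrix is the one selected, and the names `gf4_mul`, `gf4_inv` and `edge_step` are hypothetical. Each call handles one outgoing edge and takes, for every incoming edge, the sub-vector of scaling factors and the edge function.

```python
def gf4_mul(x, y):
    """Multiply two GF(4) elements encoded as 2-bit integers."""
    r = 0
    for i in range(2):
        if (y >> i) & 1:
            r ^= x << i
    if r & 0b100:
        r ^= 0b111               # alpha^2 = alpha + 1
    return r

def gf4_inv(x):
    """Inverse of a non-zero element: x^3 = 1, so x^-1 = x^2."""
    return gf4_mul(x, x)

def edge_step(inputs):
    """inputs: list of (c_sub, f) pairs, one per incoming edge.
    Returns (f_e, coding coefficients, c_e) for one outgoing edge."""
    D, n = len(inputs[0][0]), len(inputs[0][1])
    F = [[0] * n for _ in range(D)]
    for c_sub, f in inputs:                      # F_e = sum of outer products
        for l in range(D):
            for j in range(n):
                F[l][j] ^= gf4_mul(c_sub[l], f[j])
    i = next((l for l, row in enumerate(F) if any(row)), None)
    if i is None:                                # zero matrix: zero symbol
        return [0] * n, [c_sub[0] for c_sub, _ in inputs], [0] * D
    f_e = F[i]
    coeffs = [c_sub[i] for c_sub, _ in inputs]
    piv = next(j for j, x in enumerate(f_e) if x)
    inv = gf4_inv(f_e[piv])
    c_e = [gf4_mul(row[piv], inv) for row in F]  # row_l = c_e[l] * f_e
    return f_e, coeffs, c_e

A, A2 = 2, 3  # alpha and alpha^2
node3 = edge_step([([0, A, 1, 0], [1, 0]), ([0, 1, A2, 0], [0, 1])])
e6 = edge_step([([0, 1], node3[0])])
e7 = edge_step([([A2, 0], node3[0])])
e8 = edge_step([([1], [1, 0]), ([0], e6[0])])
e9 = edge_step([([A], [1, 0]), ([1], e6[0])])
print(node3, e6, e7, e8, e9)
```

With this encoding the calls return the edge functions, coding coefficients and scaling vectors listed above for nodes 3, 4 and 5.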

VI. Conclusion 

In this work, we have used path gains as variables to arrive 
at an algebraic formulation for the scalar linear network coding 
problem. This provides a useful simplification of the edge-to-edge
gain formulation proposed in [3], as illustrated by
both small and large-sized examples. Given a network coding 
problem, we have given algorithms to construct an equivalent 
transformed network and arrive at a system of polynomial 
equations (of maximum degree 2) in terms of path gains. After 
solving for the path gains, we have provided an algorithm 
to compute the edge-to-edge gains, which can be used in 
implementing the network code. 

Each monomial term occurring in a general system of
polynomial equations can be assigned a new variable to obtain 
linear equations along with consistency conditions involving 
the new variables. However, in a general polynomial system, 
the consistency conditions are not guaranteed to be degree- 
2 equations without introducing additional monomial terms 
not present in the original system. Through this work, we 
have shown that the polynomial system representing a scalar 
network coding problem reduces to only degree-2 consistency 
conditions. 